Abstract — Nonlinear optimization under nonlinear constraints is usually
difficult. However, standard ad hoc search techniques may work successfully
in some cases. Here, we consider an augmented Lagrangian formulation, and we
develop a “projection heuristic” that “guides” the iterative search toward
the optimum. We demonstrate the effectiveness of this approach by applying it
to the problem of maximizing a circuit-switched communication network’s
throughput under quality-of-service (QoS) constraints by means of choosing
the input offered load. This problem is useful for “sizing” the network
capacity. Performance results using several versions of the algorithm
demonstrate its robustness, in terms of its accuracy and convergence
properties.

Index Terms — Admission control, communication network, optimization,
performance evaluation, quality-of-service (QoS).


I.  Introduction 

Nonlinear optimization problems with multiple nonlinear constraints are
often difficult to solve, because although the available mathematical theory
provides the basic principles for solution, it does not guarantee convergence
to the optimal point [1]. The straightforward application of augmented
Lagrangian techniques to such problems typically results in slow (or lack of)
convergence, and often in failure to achieve the optimal solution. In this
technical note, we introduce a “projection heuristic” that “guides” the
iterative search more directly and more robustly to the optimal solution.

We illustrate the effectiveness of this heuristic by applying it to a
problem that arises in communication networks, namely the maximization of
throughput in multihop, circuit-switched networks that are subject to
quality-of-service (QoS) constraints on blocking probability. The objective
is to determine the offered-load profile that maximizes throughput, for
specified routing and admission-control policies. This problem is useful for
“sizing” the network capacity, i.e., for determining the maximum throughput
that can be supported by the network, subject to QoS constraints [2]. Issues
related to speed of convergence and quality of solution are addressed.
Several versions of the algorithm are defined, and performance results are
presented to illustrate their robustness.


II.  The  Optimization  Problem 


We are interested in nonlinear optimization problems with multiple
nonlinear constraints. In this section, we use Lagrangian techniques to
formulate the basic optimization problem.

1) Constrained Optimization Problem:

$$\max_{\lambda} S(\lambda) \tag{1}$$

$$\text{subject to: } P_j(\lambda) \le Q_j, \quad 1 \le j \le J$$

where $\lambda = (\lambda_1, \ldots, \lambda_J)$ is a $J$-dimensional input
vector, the performance measure $S(\lambda)$ is a nonlinear function of the
input vector, and the $Q_j$ are the values of the constraints imposed on
nonlinear functions $P_j(\lambda)$ of the input vector. For example, in
Section III we consider a circuit-switched networking example in which
$\lambda_j$ represents the offered load to circuit $j$, $S(\lambda)$ is
throughput, and $P_j(\lambda)$ is the probability that an incoming call to
circuit $j$ is blocked.

Definitions:

• We say that an input vector $\lambda$ is admissible if the constraints are
satisfied.

• The admissible region contains all input vectors that are admissible.

• Corresponding to each admissible vector $\lambda$ is a value of admissible
performance.

We convert our constrained optimization problem to an unconstrained one by
using the augmented Lagrangian function $L(\lambda, \gamma)$ [1], where
$\gamma = (\gamma_1, \ldots, \gamma_J)$ is the vector of Lagrange
multipliers. Our goal is to maximize $L(\cdot)$ over $\lambda$. To do this,
we use the iterative procedure

$$\lambda_j(k+1) = \max\left\{ \lambda_{\min},\ \lambda_j(k) + \theta(k)\,\frac{\partial L(\lambda, \gamma)}{\partial \lambda_j} \right\} \tag{3}$$

where $\theta(k)$ is a stepsize parameter, and

$$\frac{\partial L(\lambda, \gamma)}{\partial \lambda_i} = \frac{\partial S}{\partial \lambda_i} + \sum_{j=1}^{J} 1\big(P_j(\lambda) > Q_j\big)\, \frac{\partial P_j}{\partial \lambda_i}\, \big[ d\,(Q_j - P_j(\lambda)) - \gamma_j \big]. \tag{4}$$

The Lagrange multipliers, $\gamma_j$, are updated according to

$$\gamma_j(k+1) = \gamma_j(k) - 1\big(P_j(\lambda) > Q_j\big)\, c\,\big(Q_j - P_j(\lambda)\big) \tag{5}$$

where $c$ is a positive constant and $\gamma_j(0) = \gamma_0$ (a common
initial value), $j = 1, \ldots, J$. The forms of the gradients of $S$ and
$P_j$ are problem specific. A variety of rules we have used for updating the
Lagrange multipliers and stepsize parameter are discussed in [2]. We refer to
the straightforward application of the updating rule defined by (3), (4), and
(5) as the “basic search technique.”
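One iteration of the basic search technique can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the toy problem (throughput gradient of all ones, $P_j(\lambda) = \lambda_j$), and the choice to evaluate the multiplier update at the current iterate $\lambda(k)$ are assumptions made for the example. The gradient follows the form of (4), which is consistent with an augmented Lagrangian of the type $S - \sum_j 1(P_j > Q_j)[\gamma_j (P_j - Q_j) + (d/2)(P_j - Q_j)^2]$.

```python
def lagrangian_gradient(lam, gamma, grad_S, P, grad_P, Q, d):
    """Gradient of the augmented Lagrangian, following the form of (4)."""
    g = list(grad_S(lam))          # g[i] starts as dS/dlam_i
    p = P(lam)                     # p[j] = P_j(lam)
    gp = grad_P(lam)               # gp[j][i] = dP_j/dlam_i
    for j in range(len(lam)):
        if p[j] > Q[j]:            # indicator 1(P_j > Q_j): violated constraints only
            w = d * (Q[j] - p[j]) - gamma[j]
            for i in range(len(lam)):
                g[i] += gp[j][i] * w
    return g


def basic_search_step(lam, gamma, theta, grad_S, P, grad_P, Q, c, d, lam_min=0.0):
    """One application of (3) and (5); returns the new (lam, gamma)."""
    g = lagrangian_gradient(lam, gamma, grad_S, P, grad_P, Q, d)
    p = P(lam)                     # assumption: multipliers use P at the current iterate
    new_lam = [max(lam_min, l + theta * gi) for l, gi in zip(lam, g)]
    new_gamma = [gj - c * (Q[j] - p[j]) if p[j] > Q[j] else gj
                 for j, gj in enumerate(gamma)]
    return new_lam, new_gamma


# Hypothetical toy problem: S(lam) = sum(lam), P_j(lam) = lam_j, Q_j = 0.5.
toy_grad_S = lambda lam: [1.0] * len(lam)
toy_P = lambda lam: list(lam)
toy_grad_P = lambda lam: [[1.0 if i == j else 0.0 for i in range(len(lam))]
                          for j in range(len(lam))]

lam1, gamma1 = basic_search_step([1.0, 0.2], [0.0, 0.0], 0.01,
                                 toy_grad_S, toy_P, toy_grad_P,
                                 [0.5, 0.5], c=50.0, d=50.0)
```

In this toy step the first coordinate violates its constraint and is pushed back toward the admissible region while its multiplier grows; the second coordinate simply follows the throughput gradient.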

III.  Motivation  for  This  Formulation:  A  Networking  Problem 

We consider a circuit-switched network in which a predetermined path
between each pair of source and destination nodes is held throughout the
duration of each accepted session (e.g., voice call). We assume the usual
“blocked calls cleared” mode of operation, i.e., unless sessions are accepted
for immediate transmission, they are “blocked” and lost from the system.
Appropriate performance measures for this mode of operation include blocking
probability and throughput.

We consider $J$ source-destination pairs, each of which is assigned a fixed
multihop path (circuit) through the graph of the network that interconnects
them. We let $x_j$ (which may be greater than 1) denote the number of
sessions that are ongoing on the $j$th such circuit, and we assume that each
accepted session consumes a fixed amount of resource throughout its duration,
i.e., a fixed unit of bandwidth is required over each link in the circuit to
support each session. The state of the system is the $J$-dimensional vector
$x = (x_1, \ldots, x_J)$.

The capacity of network element (link or node) $i$ is denoted by $T_i$. In
the wired case, $T_i$ is the number of channels supported by link $i$,
$i = 1, \ldots, M$, where $M$ is the number of links in the network. In the
wireless case, $T_i$ is the number of transceivers at node $i$, and $M$ is
the number of nodes in the network.¹ Each network element can support
sessions corresponding to several circuits simultaneously, as long as the
state variables $x_1, x_2, \ldots, x_J$ satisfy sets of linear constraints of
the form

$$\sum_{j \in I_i} x_j \le T_i, \quad i = 1, \ldots, M \tag{6}$$

where $I_i$ is the set of circuits that share network element $i$.

A.  Admission  Control 

Our ultimate goal is to achieve optimal network performance, which,
however, depends on a large number of factors, notably routing, admission
control, and offered traffic. In [3] and [4], we approached this problem by
exercising an admission-control policy on calls, under the assumption that
routes and offered loads on each of the circuits were fixed. In this note, we
again fix the routes, but instead of determining the best admission-control
policy for a fixed offered load, we determine the offered load that maximizes
throughput for a fixed admission-control policy, subject to QoS constraints
on blocking probability.

We restrict our admission-control policies to the class of “threshold”
policies. Threshold controls restrict the number of calls that will be
admitted to the individual circuits, and can be expressed as

$$x_j \le X_j, \quad 1 \le j \le J \tag{7}$$

where $X_j$ is the threshold on circuit $j$. Transceivers are not assigned a
priori to circuits; sessions are accepted as long as the threshold values
(the $X_j$’s) are not exceeded. In [3] and [4], we also studied
“linear-combination” controls.

Policies that use threshold and/or linear-combination controls are a
subclass of the “coordinate-convex” policies [5]. A stationary
admission-control policy is specified in terms of the set of allowable states
$\Omega$. A new call is admitted if the state to be entered is in the
allowable region; otherwise, it is blocked and lost from the system.
Coordinate-convex control policies are used because they provide a form of
intelligent resource sharing without the complexity of dynamic programming.
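To make the set of allowable states concrete, the sketch below enumerates $\Omega$ for a small example from the capacity constraints (6) and the thresholds (7), and checks the usual coordinate-convexity property (removing one session from any allowable state yields another allowable state). The function names, the 0-based circuit indices, and the two-circuit example are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def allowable_states(thresholds, capacities, sharing):
    """Enumerate states x with x_j <= X_j (7) and sum_{j in I_i} x_j <= T_i (6).

    thresholds[j] = X_j, capacities[i] = T_i,
    sharing[i] = list of circuits (0-based) that use network element i.
    """
    states = []
    for x in product(*(range(X + 1) for X in thresholds)):
        if all(sum(x[j] for j in sharing[i]) <= capacities[i]
               for i in range(len(capacities))):
            states.append(x)
    return states


def is_coordinate_convex(states):
    """True if dropping one session from any state stays inside the set."""
    s = set(states)
    for x in s:
        for j, xj in enumerate(x):
            if xj > 0 and x[:j] + (xj - 1,) + x[j + 1:] not in s:
                return False
    return True


# Hypothetical example: two circuits share one element with T = 2;
# thresholds are X = (2, 1).
omega = allowable_states(thresholds=[2, 1], capacities=[2], sharing=[[0, 1]])
```

Brute-force enumeration grows exponentially with the number of circuits, so it is only practical for small examples such as this one.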

B.  The  Solution  and  Performance  Measures 

We assume Poisson arrival statistics, and denote the offered load vector by
$\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_J)$, where $\lambda_j$ is
the arrival rate to circuit $j$. The service rate vector is
$\mu = (\mu_1, \ldots, \mu_J)$, where we let $\mu_j = 1$
$(1 \le j \le J)$. Thus, the corresponding offered load on circuit $j$ is
$\rho_j = \lambda_j$. Furthermore, control is centralized, and the resources
needed to support a circuit are acquired simultaneously when the call arrives
and are released simultaneously when the call is completed.

¹For a wireless network model, this translates to the assumption that each
node has several transceivers, and that each session requires the use of one
transceiver at every node in its path; FDMA can then be conveniently assumed
for channel access, provided that there is sufficient bandwidth for all
transceivers to operate simultaneously at noninterfering frequencies.


Calls are blocked when one or more nodes along the path do not have a
transceiver available or when a decision is made not to accept a call, i.e.,
to accept the call would bring the system state outside the region defined by
admission-control policy $\Omega$. Under these conditions, in conjunction
with the use of coordinate-convex policies, it has been shown [6], [7] that
the system state has the product-form stationary distribution.² For any
allowable state space $\Omega$, it is straightforward (though time consuming)
to evaluate the normalization constant, which in turn permits the evaluation
of performance measures such as throughput and blocking probability, which we
define as follows:

$$S_j(\lambda) = \text{throughput on circuit } j = \lambda_j \big(1 - P_j(\lambda)\big) \tag{8}$$

$$S(\lambda) = \text{total throughput} = \sum_{j=1}^{J} S_j(\lambda) = \Lambda \big(1 - P_{av}(\lambda)\big) \tag{9}$$

$$P_{av}(\lambda) = \text{overall blocking probability} = \sum_{j=1}^{J} \frac{\lambda_j}{\Lambda}\, P_j(\lambda) \tag{10}$$

where $P_j(\lambda)$ is the probability that an incoming call to circuit $j$
is blocked and $\Lambda = \sum_{j=1}^{J} \lambda_j$ is the overall arrival
rate.
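The performance measures (8)-(10) can be evaluated by brute force for a small network, as sketched below. The sketch assumes the standard product form for this kind of loss system, with stationary probability proportional to $\prod_j \lambda_j^{x_j}/x_j!$ over $\Omega$ (recall $\mu_j = 1$), and takes $P_j$ as the stationary probability of the states from which one more call on circuit $j$ would leave $\Omega$. The function name, argument layout, and examples are illustrative assumptions; positive arrival rates are assumed.

```python
from itertools import product
from math import factorial


def product_form_measures(lam, thresholds, capacities, sharing):
    """Return (P, S_j, S, P_av) for thresholds (7) and capacities (6)."""
    J = len(lam)

    def allowed(x):
        return (all(0 <= x[j] <= thresholds[j] for j in range(J)) and
                all(sum(x[j] for j in sharing[i]) <= capacities[i]
                    for i in range(len(capacities))))

    # Unnormalized product-form weights over the allowable states.
    weight = {}
    for x in product(*(range(X + 1) for X in thresholds)):
        if allowed(x):
            w = 1.0
            for j in range(J):
                w *= lam[j] ** x[j] / factorial(x[j])
            weight[x] = w
    G = sum(weight.values())                      # normalization constant

    # P_j: probability of states in which an arrival to circuit j is blocked.
    P = []
    for j in range(J):
        blocked = sum(w for x, w in weight.items()
                      if not allowed(x[:j] + (x[j] + 1,) + x[j + 1:]))
        P.append(blocked / G)

    S_j = [lam[j] * (1.0 - P[j]) for j in range(J)]          # (8)
    Lam = sum(lam)
    P_av = sum(lam[j] / Lam * P[j] for j in range(J))        # (10)
    return P, S_j, sum(S_j), P_av                            # sum(S_j) is (9)


# Single circuit, one channel, lam = 1: reduces to Erlang B with one server.
P1, S1_j, S1, Pav1 = product_form_measures([1.0], [1], [1], [[0]])

# Hypothetical two-circuit example sharing one element with T = 2.
P2, S2_j, S2, Pav2 = product_form_measures([1.0, 2.0], [2, 1], [2], [[0, 1]])
```

The single-circuit case gives a blocking probability of one half, which matches the one-server Erlang B value $\lambda/(1+\lambda)$ at $\lambda = 1$.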

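The circuit blocking probabilities $P_j(\lambda)$, the circuit throughput
values $S_j(\lambda)$, and the partial derivatives (gradients)
$\partial P_j(\lambda)/\partial \lambda_i$ and
$\partial S_j(\lambda)/\partial \lambda_i$, which are used in the Lagrangian
update equation (4), are obtained from the product-form solution. In [6],
Jordan and Varaiya showed that

$$\frac{\partial P_j(\lambda)}{\partial \lambda_i} = \begin{cases} -\dfrac{1}{\lambda_i \lambda_j}\, \mathrm{cov}(x_i, x_j), & i \ne j \\[2ex] \dfrac{1}{\lambda_j^2}\, \big( E\{x_j\} - \mathrm{var}(x_j) \big), & i = j \end{cases}$$

and

$$\frac{\partial S_j(\lambda)}{\partial \lambda_i} = \frac{1}{\lambda_i}\, \mathrm{cov}(x_i, x_j). \tag{11}$$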
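The covariance expressions in (11) can be checked numerically on a small example. The sketch below computes the moments of the state under an assumed product-form distribution (weights $\prod_j \lambda_j^{x_j}/x_j!$, with $\mu_j = 1$) and provides a central-difference helper for comparison. The state set, rates, and helper names are hypothetical choices for illustration.

```python
from math import factorial


def stationary(lam, states):
    """Product-form stationary distribution over a coordinate-convex set."""
    w = {}
    for x in states:
        v = 1.0
        for j, xj in enumerate(x):
            v *= lam[j] ** xj / factorial(xj)
        w[x] = v
    G = sum(w.values())
    return {x: v / G for x, v in w.items()}


def blocking(lam, states):
    """P_j = probability of states from which a circuit-j arrival leaves the set."""
    pi = stationary(lam, states)
    return [sum(p for x, p in pi.items()
                if x[:j] + (x[j] + 1,) + x[j + 1:] not in pi)
            for j in range(len(lam))]


def throughput(lam, states):
    return [l * (1.0 - p) for l, p in zip(lam, blocking(lam, states))]


def gradients(lam, states):
    """Return (dS, dP) with dS[j][i] = dS_j/dlam_i, dP[j][i] = dP_j/dlam_i, per (11)."""
    pi = stationary(lam, states)
    J = len(lam)
    mean = [sum(p * x[j] for x, p in pi.items()) for j in range(J)]
    cov = [[sum(p * x[i] * x[j] for x, p in pi.items()) - mean[i] * mean[j]
            for j in range(J)] for i in range(J)]
    dS = [[cov[i][j] / lam[i] for i in range(J)] for j in range(J)]
    dP = [[(mean[j] - cov[j][j]) / lam[j] ** 2 if i == j
           else -cov[i][j] / (lam[i] * lam[j])
           for i in range(J)] for j in range(J)]
    return dS, dP


def central_diff(f, lam, i, h=1e-6):
    """Finite-difference estimate of d f_j / d lam_i for every j."""
    up = list(lam); up[i] += h
    dn = list(lam); dn[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(up), f(dn))]


# Hypothetical example: two circuits sharing one element (T = 2), X = (2, 1).
states = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 0)}
lam = [1.0, 2.0]
dS, dP = gradients(lam, states)
```

Comparing `dP` and `dS` with `central_diff` applied to `blocking` and `throughput` gives agreement to within the finite-difference error, which is a useful sanity check when implementing the gradients for a larger network.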


IV.  Guided  Search  Techniques 

When the basic search technique is applied to the networking problem of
Section III, significant (although nonmonotonic) progress is typically made
in the early stage of the search, whereas considerably less-productive
oscillatory behavior is observed as the search progresses. Moreover, the
quality of the solution is often sensitive to the starting point of the
search. A common difficulty in constrained optimization problems arises
because the optimum lies on the search boundary (i.e., one or more of the
circuit blocking probabilities is at the maximum-permitted QoS value). In
unconstrained optimization problems, gradient search procedures are naturally
slowed (smaller steps) by the decreasing gradient as they approach the
optimum. This slowing allows the search to ascend smoothly to the maximum.
However, when the optimum lies on the boundary, as it often does in
constrained problems,³ there is not necessarily a decrease in the gradient in
its neighborhood. In this case, typical gradient search techniques rely on
damping of the stepsize $\theta$ to cause the search to slow and home in on
the optimum. However, an overly rapid decrease in $\theta$ results in failure
to reach the optimal solution, whereas a less rapid decrease in $\theta$ can
result in unacceptably slow convergence.

²It is not necessary to assume that the call duration is exponential. A
Poisson arrival process and a general service time distribution are
sufficient for the product-form solution to apply [8]; knowledge of the means
of the service times provides enough information to determine the equilibrium
distribution.

³We believe that, in our optimization problem, at least one of the circuit
blocking probabilities at the optimal point must be at the maximum permitted
value. This conjecture is supported by extensive empirical evidence in a
variety of network examples. We have observed that, typically, between half
and all of the circuit blocking probabilities are at the maximum permitted
value.

The most interesting, and troublesome, behavior occurs when the search
trajectory passes near the QoS constraint contour. The violation of a
constraint results in oscillatory behavior with little progress toward the
optimal point. The desired behavior would be for the search to proceed along
the contour corresponding to the QoS constraint, rather than at a significant
angle to it. We have attempted to mitigate the oscillatory behavior of the
basic search technique by using our knowledge of the throughput and blocking
probability gradients to guide the search more efficiently.

A.  Guiding  the  Search:  Preliminary  Approach 

To illustrate the principle of guided search, we consider an example in
which the blocking probability of the “dominant circuit” (i.e., the circuit
with the largest blocking probability) is close to (say within some
$\varepsilon$ of) the specified QoS value. We would like to guide the search
in a direction of increasing throughput, so that it tends to proceed parallel
to the contour at which the blocking probability is at the specified QoS
value. To simplify the discussion, let us first consider the case in which
exactly one of the circuit blocking probabilities (the dominant circuit) is
located within $\varepsilon$ of the QoS constraint, i.e.,
$Q_j - \varepsilon < P_j(\lambda) < Q_j + \varepsilon$ for exactly one value
of $j \in \{1, 2, \ldots, J\}$. Let us call this circuit $c$. In this case,
we would like the search to proceed along the component of the throughput
gradient $\nabla S$ that is orthogonal to the circuit blocking probability
gradient $\nabla P_c$ at our current point in the search. By eliminating the
component parallel to $\nabla P_c$, we discourage increase in the blocking
probability of the dominant circuit. The desired projection can be written as


$$\{\text{Component of } \nabla S \text{ orthogonal to } \nabla P_c\} = \nabla S - \frac{\nabla S \cdot \nabla P_c}{\|\nabla P_c\|^2}\, \nabla P_c \tag{12}$$

where

$$\nabla S \cdot \nabla P_c = \sum_{i=1}^{J} \frac{\partial S}{\partial \lambda_i} \frac{\partial P_c}{\partial \lambda_i} = \sum_{i=1}^{J} \sum_{j=1}^{J} \frac{\partial S_j}{\partial \lambda_i} \frac{\partial P_c}{\partial \lambda_i} \tag{13}$$

and $\|X\| = \sqrt{\sum_{i=1}^{J} X_i^2}$ is the norm of the vector $X$. Then
we introduce a vector $D = (D_1, D_2, \ldots, D_J)$ (see Fig. 1), which is
equal to this projection when the blocking probability of the dominant
circuit is located in a band of width $2\varepsilon$ centered about the QoS
contour; otherwise, $D$ is equal to the throughput gradient $\nabla S$:

$$D = \begin{cases} \nabla S - \dfrac{\nabla S \cdot \nabla P_c}{\|\nabla P_c\|^2}\, \nabla P_c, & \mathrm{QoS} - \varepsilon < P_c < \mathrm{QoS} + \varepsilon \\[2ex] \nabla S, & \text{otherwise.} \end{cases} \tag{14}$$

We modify the Lagrangian objective function of (4) by inserting $D_i$ in
place of $\partial S / \partial \lambda_i$ as follows:

$$\frac{\partial L(\lambda, \gamma)}{\partial \lambda_i} = D_i + \sum_{j=1}^{J} 1\big(P_j(\lambda) > Q_j\big)\, \frac{\partial P_j}{\partial \lambda_i}\, \big[ d\,(Q_j - P_j(\lambda)) - \gamma_j \big]. \tag{15}$$

Fig. 1. $D$ = component of $\nabla S$ that is orthogonal to $\nabla P_c$.


B. Guiding the Search: Generalized Approach

The use of the projection operation described above removes the component
of $\nabla S$ that is parallel to $\nabla P_c$. By doing so, we update
$\lambda$ in a direction that increases throughput without increasing $P_c$.
However, the typical consequence of doing so is that one or more of the other
circuits will soon violate the QoS constraint. At a typical point in the
search, it is common for several circuits to violate the QoS constraint or to
be sufficiently close to the QoS boundary that the QoS constraint is in
danger of being violated. For example, we have observed behavior in which the
chosen circuit for the projection alternates among two or three of the
circuits, resulting in oscillatory behavior in which little progress is made
toward the optimal solution. To mitigate this behavior, we have considered a
generalized form of the projection operation in which several circuits are
included in the projection. The inclusion of several circuits takes into
consideration the fact that we are dealing with a number of constraints
simultaneously. Thus, we would like to update $\lambda$ in a direction that
discourages violation of any of the QoS constraints.

To incorporate the QoS constraints associated with some or all of the
circuits into the search-guiding mechanism, we introduce the quantity
$P_\Sigma$, which is a function of the circuit-blocking probabilities
$P_1, P_2, \ldots, P_J$. In this note, we have used the following simple,
linear form for $P_\Sigma$:

$$P_\Sigma = \sum_{j \in \Sigma} P_j \tag{16}$$


where $\Sigma$ is a subset of $\{1, 2, \ldots, J\}$. The vector $D$,
introduced in (14), is then rewritten as (17), shown below. The projection
vector $D$ specified by (17) removes the component of $\nabla S$ that is in
the direction of the gradient of the average blocking probability of the
circuits included in $\Sigma$. This expression is identical to that of (14),
except that $P_\Sigma$ replaces $P_c$ in the dot products, and that the
projection operation is used only when the resulting value of $\|D\|$ is
sufficiently large. The reason for using the projection operation only when
it provides a sufficiently large value of $\|D\|$ is based on our
experimental observation that (in some cases) the trajectory can reach a
point at which $\|D\|$ is quite small. This behavior results in slow progress
toward the optimal point, or even virtually total stopping of the trajectory,
resulting in premature convergence; in fact, the trajectory can converge to a
point interior to the admissible region (thus none of the circuit blocking
probabilities are at the specified value, a condition not characteristic of
the optimal point). This behavior is especially prevalent when the set
$\Sigma$ is large (e.g., we have considered cases in which $\Sigma$ contains
all $J$ circuits). It occurs when the gradients of $S$ and $P_\Sigma$ are
nearly parallel to each other. Turning off the projection operation
(typically for just a single iteration) permits the trajectory to escape from
such undesirable points. We have found that a value of $\tau = 0.1$ for the
threshold on $\|D\|$ works well.
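The projection and its fallback can be sketched in a few lines. This is an illustrative sketch only: the helper names are hypothetical, and the test used here (compare the norm of the projected vector directly with $\tau$) is one plausible reading of "sufficiently large"; the paper's exact condition is not reproduced in this text.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return sqrt(dot(u, u))


def project_out(grad_S, grad_P):
    """Component of grad_S orthogonal to grad_P, in the form of (12)."""
    denom = dot(grad_P, grad_P)
    if denom == 0.0:               # nothing to project out
        return list(grad_S)
    k = dot(grad_S, grad_P) / denom
    return [s - k * p for s, p in zip(grad_S, grad_P)]


def guided_direction(grad_S, grad_P_sigma, tau=0.1):
    """Use the projection unless it leaves too small a vector (assumed test)."""
    D = project_out(grad_S, grad_P_sigma)
    return D if norm(D) >= tau else list(grad_S)


# The projection removes the part of grad_S along grad_P.
D1 = guided_direction([1.0, 1.0], [1.0, 0.0])

# Nearly parallel gradients: the projected vector is tiny, so fall back to grad_S.
D2 = guided_direction([1.0, 0.01], [2.0, 0.0])
```

In the second call the two gradients are almost parallel, the projected vector has norm 0.01, and the sketch returns the unprojected throughput gradient, which mirrors the escape behavior described above.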

$$D = \begin{cases} \nabla S - \dfrac{\nabla S \cdot \nabla P_\Sigma}{\|\nabla P_\Sigma\|^2}\, \nabla P_\Sigma, & \text{if the resulting } \|D\| \text{ is sufficiently large (threshold } \tau\text{)} \\[2ex] \nabla S, & \text{otherwise} \end{cases} \tag{17}$$

Care must also be taken in the choice of several other parameters used in
the algorithm, such as $c$ (used in updating the Lagrange multipliers in (5))
and $d$ (which weights the penalty term in (15)). The use of $c = d = 50$
worked well for high values of QoS (e.g., $> 0.2$), but not for more
realistic values. The relatively poor performance for low values of QoS was
observed because the gradient terms $\partial P_j / \partial \lambda_i$ were
too small to drive the search back into the admissible region at the low
offered loads that are characteristic of low values of QoS. We have observed
experimentally that this problem can be mitigated by weighting the
constraint-violation terms by $1/\sqrt{Q_j}$ (while maintaining
$c = d = 50$) as follows:


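$$\frac{\partial L(\lambda, \gamma)}{\partial \lambda_i} = D_i + a \sum_{j=1}^{J} 1\big(P_j(\lambda) > Q_j\big)\, \frac{\partial P_j}{\partial \lambda_i}\, \frac{\big[ d\,(Q_j - P_j(\lambda)) - \gamma_j \big]}{\sqrt{Q_j}} \tag{18}$$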


where $a$ is a “kick-up” factor that can be updated (increased from an
initial value of 1) as necessary, e.g., $a$ can be increased if too many
consecutive inadmissible solutions are observed, or decreased if too many
consecutive admissible solutions are observed (after the inadmissible region
has been entered at least once). The incorporation of these heuristic fixes
into the update equation has resulted in a robust algorithm that does not
require the fine-tuning of parameters.
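One possible way to adapt the kick-up factor is sketched below. The text does not specify how many consecutive iterations count as "too many" or by how much $a$ changes, so the `patience` window and the multiplicative `factor` are hypothetical values chosen only to make the rule concrete.

```python
def update_kick_up(a, admissible_history, patience=5, factor=2.0):
    """Adjust the kick-up factor a from the record of admissibility.

    admissible_history: list of booleans, one per iteration so far,
    True when the iterate satisfied every QoS constraint.
    patience and factor are illustrative assumptions, not values from the paper.
    """
    if len(admissible_history) < patience:
        return a
    recent = admissible_history[-patience:]
    if not any(recent):
        # Too many consecutive inadmissible solutions: strengthen the penalty.
        return a * factor
    if all(recent) and not all(admissible_history):
        # Too many consecutive admissible solutions, and the inadmissible
        # region has been entered at least once: relax the penalty.
        return a / factor
    return a


a_up = update_kick_up(1.0, [True] * 3 + [False] * 5)
a_down = update_kick_up(2.0, [False] + [True] * 5)
a_same = update_kick_up(1.0, [True] * 10)
```

The third call leaves $a$ unchanged because the trajectory has never left the admissible region, which matches the parenthetical condition in the text.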


V.  Alternative  Versions  of  the  Algorithm 

We have studied several versions of the algorithm based on (18), which
differ in their use of the dot-product projection and in the stepsize update
rule. Here, we briefly describe one of our approaches; a complete discussion
is provided in [2]. In our discussion, it is implicitly assumed that
$Q_j = \text{constant}$, $j = 1, \ldots, J$ (i.e., that all circuits are
subject to the same constraint on maximum blocking probability), although it
is certainly possible to define projection rules that incorporate different
QoS values (see [2]).

A.  Projection  Rules 

The projection rule, as described in Section IV-B, guides the search by
removing the component of the throughput gradient that is parallel to
$\nabla P_\Sigma$, where $P_\Sigma = \sum_{j \in \Sigma} P_j$, for some
subset $\Sigma$ of $\{1, 2, \ldots, J\}$. The effect of the projection is to
remove the component of the throughput gradient that is in the direction of
the gradient of the sum of the blocking probabilities (or, equivalently, the
gradient of the average blocking probability) of the circuits included in
$\Sigma$. By including several circuits in $\Sigma$, it is possible to
discourage (although not necessarily prevent) the blocking probabilities of
these circuits from exceeding the QoS constraint value. In addition, the
oscillatory behavior that results from the use of a single circuit (the
identity of which typically alternates among a small set of circuits) in the
dot product is reduced. However, it must be acknowledged that the use of the
projection is a heuristic approach. The performance results presented in
Section VI and [2] demonstrate that, if used judiciously, the projection can,
in fact, be very helpful.

In most versions of the algorithm studied in the core runs of [2], we used
a version of the projection rule in which $\Sigma$ is defined as follows:

$$\Sigma = \big\{ j : P_j \ge p_{\min} + \nu\,(p_{\max} - p_{\min}) \big\} \tag{19}$$

where $p_{\min} = \min\{P_j,\ j = 1, 2, \ldots, J\}$,
$p_{\max} = \max\{P_j,\ j = 1, 2, \ldots, J\}$, and $\nu \in [0, 1]$. The
parameter $\nu$ can be chosen to include either few or many circuits, as
desired. For example, for the network discussed in this note, the choice of
$\nu = 0.2$ causes, on the average, about eight (out of ten) circuits to be
included in $\Sigma$ (thus $\Sigma$ is a large set). Alternative choices for
the set $\Sigma$ are considered in [2].
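The selection rule (19) and the gradient of $P_\Sigma$ can be sketched as follows. The function names, the 0-based circuit indices, and the numerical values are illustrative assumptions; the comparison is taken as "greater than or equal", so that $\nu = 0$ selects every circuit.

```python
def projection_set(P, nu):
    """Circuits included in the projection, following the rule in (19).

    P[j] is the current blocking probability of circuit j (0-based index);
    nu in [0, 1] controls how many circuits are included.
    """
    p_min, p_max = min(P), max(P)
    threshold = p_min + nu * (p_max - p_min)
    return [j for j, pj in enumerate(P) if pj >= threshold]


def grad_P_sigma(grad_P, sigma):
    """Gradient of P_Sigma = sum of P_j over sigma; grad_P[j][i] = dP_j/dlam_i."""
    n = len(grad_P[0])
    return [sum(grad_P[j][i] for j in sigma) for i in range(n)]


# Hypothetical blocking probabilities for four circuits.
P_now = [0.05, 0.12, 0.28, 0.30]
sigma_small_nu = projection_set(P_now, nu=0.2)   # most circuits included
sigma_all = projection_set(P_now, nu=0.0)        # every circuit included

g_sigma = grad_P_sigma([[1.0, 2.0], [3.0, 4.0]], [0, 1])
```

A small $\nu$ places the threshold just above the smallest blocking probability, so only the least-loaded circuits are left out of the projection.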


We have observed that the use of a large set $\Sigma$ tends to keep the
trajectory well inside the admissible region during the early phase of the
search, and discourages the trajectory from straying too far into the
inadmissible region once the QoS-constraint boundary has been crossed.
However, although the neighborhood of the optimal point is reached rapidly,
it is common for the trajectory to proceed past it, eventually converging to
a point relatively far from the optimal. Apparently, the algorithm does not
converge to the true optimal point because of the distortion introduced by
the use of $D$ rather than $\nabla S$.

Based on these observations, which have been supported by extensive
numerical results, we have concluded that it is often best to use a large set
$\Sigma$ during the early phase of the search, and then to turn off the
projection term (i.e., set $\Sigma = \emptyset$, the empty set) at some point
during the search. When the projection is turned off, the final approach to
the optimal solution can be made without the presence of distortion.

B.  Stepsize  Considerations 

Typically, we have chosen the initial stepsize $\theta_0$ on the basis of a
short pilot run in which the projection is not used; it is chosen so that,
starting at $\lambda_i = 0$, the trajectory exits the admissible region for
the first time after about five to fifteen iterations. The same value of
$\theta_0$ is used whether or not the projection is used in the actual
search.

We have found that a first exit point of five iterations works well for
large values of the QoS constraint, e.g., 0.3. However, this approach appears
to produce an excessively large initial stepsize for small values, e.g.,
0.001. Thus, in some of our examples for $Q_j = 0.001$ we have used an
initial value of $\theta$ that is half that produced by using the rule based
on exiting the admissible region for the first time at the fifth iteration.
To explain the difference in behavior, consider the terms
$\partial S_i / \partial \lambda_i$ derived from (4), which are usually
significantly larger than the terms $\partial S_j / \partial \lambda_i$,
when $j \ne i$. These “diagonal” terms have a value close to 1 at the low
offered loads that are characteristic of low blocking probability; however,
these terms are considerably smaller at offered loads characteristic of
significantly higher blocking probability (typical average values are
approximately 0.3 in many of our examples for $Q_j = 0.3$). Thus the use of
smaller stepsizes at low QoS values compensates for the larger values of
throughput gradient at the corresponding offered loads.

VI.  Performance  Results  for  a  Networking  Example 

In this section, we discuss the performance of the search algorithms in
terms of the evolution of the admissible throughput as the search progresses.
We refer to the version that uses the projection in the first phase simply as
the “projection algorithm.” We also present results for the basic search
technique, which does not use the projection at all. Both of these versions
are based on a stepsize rule in which $\theta$ is constant for 100
iterations, then decreases exponentially to 0.1 of its initial value after an
additional 100 iterations, and then decreases exponentially to 0.001 of its
initial value after an additional 800 iterations. In the version with the
projection algorithm, $\nu = 0.2$ is applied for the first 100 iterations;
the projection operation is turned off by setting $\Sigma = \emptyset$ for
the next 900 iterations. Alternative stepsize and projection rules are
discussed in [2].
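The stepsize rule just described can be written as a small schedule function. The sketch assumes that iterations are indexed from 0 and that "decreases exponentially" means geometric interpolation between the stated endpoints; both are interpretations, and the function names are hypothetical.

```python
def stepsize(k, theta0):
    """Stepsize at iteration k (0-based) for the three-phase rule.

    Phase 1 (k < 100): constant theta0.
    Phase 2 (100 <= k < 200): geometric decay from theta0 to 0.1 * theta0.
    Phase 3 (200 <= k <= 1000): geometric decay from 0.1 * theta0 to 0.001 * theta0.
    """
    if k < 100:
        return theta0
    if k < 200:
        return theta0 * 0.1 ** ((k - 100) / 100)
    k = min(k, 1000)                       # hold the final value beyond the run
    return theta0 * 0.1 * 0.01 ** ((k - 200) / 800)


def projection_enabled(k):
    """Projection algorithm: projection on for the first 100 iterations only."""
    return k < 100


schedule = [stepsize(k, 1.0) for k in range(1001)]
```

The schedule is continuous at the phase boundaries (iterations 100 and 200) and never increases, so the projection phase and the constant-stepsize phase coincide.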

Fig. 2 shows the “admissible” throughput (i.e., values are not shown when
the QoS constraint is violated) for both the basic search technique and
projection algorithm for the case of Network 1 with $T_i = 6$, $X_j = 4$, and
$Q_j = 0.3$ $(1 \le j \le J)$.⁴ This example is typical, in that the use of
the projection operation provides a smoother ascent to good throughput values
early in the search, and hence typically faster attainment of the 95% and 98%
milestones⁵ [2]. However, there is usually less difference in the speed with
which the higher milestones are reached, and sometimes the basic search
technique reaches them faster.

⁴This “unrealistically high” value of blocking probability was used because
it typically results in a more-difficult optimization problem than lower
values (e.g., $Q_j = 0.001$), in the sense that a greater number of
iterations is usually needed.

Fig. 2. Evolution of admissible throughput; $Q_j = 0.3$. (a) Basic search
technique. (b) Projection algorithm.

We discovered in our early studies that the use of the projection operation
often prevents convergence to the optimal solution, especially when a
relatively large number of circuits are included in the projection. We may
view the use of the projection term in the first phase(s) as the
determination of an “initial condition” for the “undistorted” version of the
algorithm (i.e., the version without the projection term). Thus, as long as
the trajectory is brought sufficiently close to the neighborhood of the
optimal solution in the early phase(s), the undistorted version of the
algorithm should bring the solution close to the optimal point before the end
of the allotted 1000 iterations.

Even for network examples in which all versions converge to nearly the same
point, the use of the projection operation can have a profound impact on the
behavior of the algorithm. For example, when a large number of circuits are
included in the projection set $\Sigma$ (e.g., by using a relatively small
value of $\nu$ such as 0.2), a relatively smooth (although perhaps somewhat
slow) trajectory is observed in which the throughput increases monotonically
to a large percentage of the benchmark throughput value before exiting the
admissible region for the first time. By contrast, when the projection
operation is not used, the trajectory is much rougher, with considerably
larger deviations in offered load and throughput from one iteration to the
next. Although it is indeed possible to achieve some of the high milestone
values relatively early in the run when the projection is not used, it may be
a matter of “luck” as to whether or not such points are indeed found early.
Even if they are found, the trajectory will often move far from these points
because of the large stepsize. Based on the extensive testing discussed in
[2], it appears that the smoothing effect of the projection operation with
$\nu = 0.2$ permits the effective use of relatively aggressive stepsize
rules, thus permitting faster convergence.

Our primary conclusion, obtained by examining the data presented in [2],
is that virtually all versions of the algorithm perform well, based on
the criterion of providing optimal (or nearly optimal) throughput within
1000 iterations. However, use of the projection operation in the early
part of the search can be beneficial. For example, it typically results
in reaching the 95% and 98% milestones faster than is possible with the
basic search technique, and, as just noted, it permits the use of more
aggressive stepsize rules, which result in faster overall convergence.

⁵ For example, the “95% milestone” is the first point at which an
admissible throughput value as high as 95% of the best value (observed
for any algorithm for the current problem) is obtained.
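
The milestone measure defined in footnote 5 can be computed directly from a recorded search trajectory. The following is a minimal sketch under stated assumptions: the function name `milestone_iteration` and the trajectory values are hypothetical, and the best value (observed for any algorithm on the current problem) is assumed to be supplied by the caller.

```python
from typing import Optional, Sequence


def milestone_iteration(throughputs: Sequence[float],
                        admissible: Sequence[bool],
                        best_value: float,
                        fraction: float) -> Optional[int]:
    """Return the first iteration index at which an admissible throughput
    of at least fraction * best_value is obtained, or None if never."""
    target = fraction * best_value
    for k, (s, ok) in enumerate(zip(throughputs, admissible)):
        if ok and s >= target:
            return k
    return None


# Hypothetical trajectory: iteration 2 exceeds 95% of the best value but
# violates the QoS constraints, so it does not count toward the milestone.
throughputs = [10.0, 40.0, 96.0, 90.0, 97.0, 99.0]
admissible = [True, True, False, True, True, True]

m95 = milestone_iteration(throughputs, admissible, 100.0, 0.95)
m98 = milestone_iteration(throughputs, admissible, 100.0, 0.98)
print(m95, m98)
```

Only admissible points are counted, which is why a rough trajectory that repeatedly leaves the admissible region may reach a milestone later than its raw throughput values suggest.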

A.  An  Observation 

One characteristic property of the optimal solution in constrained
optimization problems such as ours is that at least one of the circuit
blocking probabilities must be at the maximum permissible value, i.e.,
at $Q_j$. To measure how closely the individual circuits approach this
value, we introduce the normalized circuit blocking probabilities

$$\tilde{P}_j = P_j / Q_j, \qquad j = 1, \ldots, J. \tag{20}$$

Thus $\tilde{P}_j = 1$ when $P_j = Q_j$.
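
As a small illustration of (20), the normalized values can be computed and inspected to see which circuits sit on their QoS bound. This is a sketch only: the function names, the tolerance `tol`, and the three-circuit data are hypothetical and are not taken from the examples of this note.

```python
from typing import List, Sequence


def normalized_blocking(P: Sequence[float], Q: Sequence[float]) -> List[float]:
    """Return P_j / Q_j for each circuit j, as in (20)."""
    if len(P) != len(Q):
        raise ValueError("P and Q must have one entry per circuit")
    return [p / q for p, q in zip(P, Q)]


def circuits_at_bound(P_norm: Sequence[float], tol: float = 1e-3) -> List[int]:
    """Indices of circuits whose normalized blocking is within tol of 1."""
    return [j for j, r in enumerate(P_norm) if abs(r - 1.0) <= tol]


# Hypothetical three-circuit example with a common QoS level Q_j = 0.3.
P = [0.30, 0.15, 0.27]
Q = [0.3, 0.3, 0.3]

P_norm = normalized_blocking(P, Q)
print(P_norm)                     # roughly 1.0, 0.5, and 0.9
print(circuits_at_bound(P_norm))  # only circuit 0 sits on its QoS bound
```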

The fact that not all blocking probabilities are near the specified QoS
level when $Q_j = 0.3$ is not surprising. It is not a failure of the
algorithm, but rather reflects the fact that the level of interaction
among the circuits increases as offered load increases.⁶ Thus, there
does not exist a set of offered-load values for which all blocking
probabilities are at the maximum permitted QoS value when that value is
relatively high (e.g., 0.3).

On the basis of these observations, as well as additional discussion in
[2], it appears that whenever the optimal solution does, in fact, lie
very close to the QoS contour in all dimensions, there is very little
difference in the quality of the solutions produced by the various
versions of the algorithm. Also, it appears that our algorithm is more
robust in such cases; typically, fewer iterations are needed, and more
aggressive stepsize rules (resulting in faster attainment of milestones)
are usually successful. Furthermore, we believe that one can have more
confidence in the quality of the solution if the blocking probabilities
are all close to the QoS constraint value. In some cases (particularly
when several of the blocking probabilities are far from the QoS
boundary), the network designer/manager might want to run several
versions of the algorithm to ensure that the solution is close to the
true optimum.

B.  An  Alternative  QoS  Constraint:  Average  Blocking  Probability 

In [2], we also considered an alternative version of the QoS constraint
in which we require only that the average blocking probability in the
network satisfy this constraint. It was shown that relaxing the QoS
constraint in this manner results not only in higher throughput values
but also in considerably faster convergence, even when the projection
operation is not used. Both of these characteristics are a consequence
of the need to satisfy only a single average QoS constraint, which
permits the set of offered loads to trade off among themselves more
easily than in the case in which the QoS constraint must be satisfied on
each individual circuit. In view of the ability of the basic search
technique to obtain optimal solutions rapidly and reliably without using
the projection operation, we do not consider this alternative constraint
in the present note.
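
The difference between the two forms of the QoS constraint can be illustrated with a short check. This is a sketch under stated assumptions: the present note does not specify how the average is formed, so both the unweighted mean and the weighted mean (e.g., weights equal to the offered loads) are assumptions, and the function names and numerical values are hypothetical.

```python
from typing import Optional, Sequence


def meets_per_circuit_qos(P: Sequence[float], Q: Sequence[float]) -> bool:
    """True if every circuit satisfies its own bound, P_j <= Q_j."""
    return all(p <= q for p, q in zip(P, Q))


def meets_average_qos(P: Sequence[float], q_avg: float,
                      weights: Optional[Sequence[float]] = None) -> bool:
    """True if the (optionally weighted) mean blocking probability
    does not exceed q_avg. The weighting scheme is an assumption."""
    if weights is None:
        weights = [1.0] * len(P)
    avg = sum(w * p for w, p in zip(weights, P)) / sum(weights)
    return avg <= q_avg


# Hypothetical blocking probabilities for three circuits, QoS level 0.3.
P = [0.35, 0.25, 0.20]

per_circuit_ok = meets_per_circuit_qos(P, [0.3, 0.3, 0.3])
average_ok = meets_average_qos(P, 0.3)
weighted_ok = meets_average_qos(P, 0.3, weights=[10.0, 1.0, 1.0])
print(per_circuit_ok, average_ok, weighted_ok)
```

Here the first circuit violates its individual bound, yet the unweighted average is admissible because the other circuits compensate; this is the kind of trade-off among offered loads that the single average constraint permits.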

VII.  Conclusion 

In this note, we have addressed the solution of nonlinear optimization
problems with multiple nonlinear constraints, based on the use of
Lagrangian techniques with a penalty function. We observed several
shortcomings associated with standard Lagrangian techniques. First,
there was no guarantee of convergence and no guarantee of approaching
the optimal solution. Second, there were many parameters that could be
“tuned” and thus affect the solution. Third, the direct use of standard
versions of the Lagrangian techniques was very slow and often fraught
with oscillations.

⁶ The values of the partial derivatives ($\partial P_j / \partial \lambda_j$)
used in the update equations are increasing functions of the offered
load.

Therefore, we proposed a heuristic modification to the search algorithm,
which is based on the use of the projection of the gradient on an
appropriate plane determined by the constraint surfaces, and found that
there was improvement in all aspects of the search. If, in addition,
fine tuning of the parameters was used, the results were indicative
(although, still, not assuring) of convergence to the optimal solution
at reasonable speeds.

In this note, we have applied the projection algorithm to a nonstandard
problem in communication networking. The proposed problem is useful and
meaningful in two distinct ways. First, it establishes a “capacity-like”
result for a given network in which the routes are fixed. In other
words, even though the network operator normally will not choose the
input load vector (although via pricing controls even this choice can be
implemented), it will be possible to predetermine what the ultimate
capabilities of the network are for the chosen set of routes. That is,
it will permit the network operator to “size” the network and thus
enrich the control capabilities in its operation.

Second, the optimal routing problem, i.e., finding the best routes for a
given input load, although a typical network operation problem, is
essentially unsolvable in practice, since it is an NP-complete
combinatorial optimization problem. This is why routing in
circuit-switched networks, like the Public Switched Telephone Network,
has been the object of study for many years and has generated a large
number of suboptimal heuristics. This is in contrast to the
packet-switched, datagram routing problem, which is a well-behaved and
essentially solved problem. Therefore, when a set of routes is chosen
for a given input load, it is likely to be used for a period of time
even if the input load changes. Dynamic adjustment of heuristically
obtained suboptimal routes on a short time scale is not feasible, nor
does it make much sense. Consequently, the approach we introduce in this
note permits the network operator to establish the maximum throughput
this set of routes is capable of carrying (while meeting the blocking
probability requirements), and thereby establish how much of a gap there
is between the achieved throughput and the achievable throughput (i.e.,
how much of a mismatch there is between the actual input load and the
actual set of routes). This knowledge could be used, in fact, as a
criterion for deciding whether to re-solve the routing problem and
change the set of circuit paths of the network. Thus, although on its
surface the problem we propose may appear unorthodox, we believe it
offers a totally novel tool for network operation and design.

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 47, NO. 6, JUNE 2002

Although we did not investigate the applicability and usefulness of this
heuristic in other nonlinear optimization problems (from the networking
area or from other disciplines), we suspect that it possesses inherent
robustness properties that are likely to make it applicable elsewhere as
well. We also believe that our investigation yields further evidence
that, in the field of communication networks, there are opportunities
for fertile use of optimization theory techniques, as observed in [9].